The Expected Value of a Random Variable $X$ is the sum, over each possible value[1] of $X$, of that value times the Probability of seeing it. In the discrete case (and you'll see all of these as ways of notating the same idea),
$$E[X] = \sum_x x \cdot P(X = x) = \sum_x x \cdot P(x) = \sum_x x \cdot f_X(x) = \sum_x x \cdot f(x)$$
where $f_X(x)$ is the probability distribution[2] of $X$. You'd use the fancy $\int$ in place of the humble $\sum$ above in the continuous case.
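To make the sum concrete, here is a minimal Python sketch for a fair six-sided die. The helper name `expected_value` and the dictionary representation of the distribution are my own choices for illustration, not anything standard. Fractions are used so the arithmetic stays exact.

```python
from fractions import Fraction


def expected_value(pmf):
    """Return E[X] for a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())


# A fair six-sided die: each face has probability 1/6 (an assumed example).
die = {face: Fraction(1, 6) for face in range(1, 7)}

e_die = expected_value(die)
print(e_die)         # 7/2
print(float(e_die))  # 3.5
```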
Linearity and Constancy
The Expected Value of a number 42 is… drumroll… 42. This may seem silly to state but there's a Linearity property that's pretty important. If $X$ is a Random Variable and $a$ and $b$ are some constants,
$$E[a + bX] = E[a] + E[bX] = a + bE[X]$$
This is a Very Nice Thing to use in proofs and computations: $a$ shifts $E[X]$ and $b$ scales it.
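A quick numerical check of Linearity, again as a sketch on a fair die. The constants `a = 10` and `b = 2` are arbitrary picks for illustration, and `expected_value` is a hypothetical helper, not a library function.

```python
from fractions import Fraction


def expected_value(pmf):
    """Return E[X] for a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())


die = {face: Fraction(1, 6) for face in range(1, 7)}
a, b = 10, 2  # arbitrary constants, chosen only for this example

# Distribution of a + bX: same probabilities, transformed values.
shifted_scaled = {a + b * x: p for x, p in die.items()}

lhs = expected_value(shifted_scaled)  # E[a + bX]
rhs = a + b * expected_value(die)     # a + b E[X]
print(lhs, rhs)  # 17 17
```

Both sides come out the same without ever having to rebuild the distribution on the right-hand side, which is exactly why the property saves work.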
Important Things
$E[E[X]] = E[X]$. May seem obvious to a lot of people but not to yours truly because I overthink things. Once computed, $E[X]$ is a plain number and not a Random Variable, and the Expected Value of a number is that number!
What is the Expected Value of a Dice Roll?
3.5
OK what is the Expected Value of the Expected Value of a Dice Roll?
We just did that. Still 3.5, the Expected Value of a Dice Roll… are you okay?
□
Now this one's a doozy: if $Y$ is another Random Variable, $E[E[Y \mid X]] = E[Y]$. How can that be?
Consider this: What is the Expected Value of the height $H$ of 145 people you picked at random, where each person's height $h_i$ is equally likely?
$$E[H] = \sum_{i=1}^{145} h_i \cdot P(H = h_i) = \sum_{i=1}^{145} h_i \cdot \frac{1}{145} = \frac{1}{145} \sum_{i=1}^{145} h_i$$
Now you ask: What is the Expected Value of the height given a Random Variable Sex, $S \in \{\text{Male}, \text{Female}, \text{Intersex}\}$? This is a simple conditional expectation. Remembering the Expected Value is the sum of products of realizations and their probabilities,
$$E[H \mid S] = \sum_h h \cdot P(H = h \mid S)$$
Now $E[H \mid S]$ is still a random variable because we haven't specified a value for $S$ (i.e., we haven't "collapsed" it to a specific thing like $E[H \mid S = \text{Female}]$). So once again, remembering that Expected Value is a sum of products, this time of the value $E[H \mid S = s]$ and the probability $P(S = s)$ over all values $s$ of $S$,
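$$\begin{aligned}
E[E[H \mid S]] &= \sum_{s \in \{M, F, I\}} P(S = s) \left[ \sum_h h \cdot P(H = h \mid S = s) \right] \\
&= P(S = \text{Male}) \sum_h h \cdot P(H = h \mid S = \text{Male}) + P(S = \text{Female}) \sum_h h \cdot P(H = h \mid S = \text{Female}) + P(S = \text{Intersex}) \sum_h h \cdot P(H = h \mid S = \text{Intersex}) \\
&= \sum_h h \cdot \sum_{s \in \{M, F, I\}} P(H = h \mid S = s) \, P(S = s) \\
&= \sum_h h \cdot P(H = h) = E[H]
\end{aligned}$$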
So you're getting the Expected Value of the height across everyone in $S$, which is simply $E[H]$ 🥳 This is very nice when we get to Variance and Covariance!
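If you overthink things like I do, it helps to check this on numbers. The sketch below uses a tiny joint distribution of sex and height that is entirely made up for illustration (the heights and probabilities are assumptions, not data), and the helper names are hypothetical. It computes $E[H]$ directly, then computes each $E[H \mid S = s]$ and averages them weighted by $P(S = s)$.

```python
from fractions import Fraction

# Hypothetical joint distribution P(S = s, H = h); the numbers are made up.
joint = {
    ("Male", 170): Fraction(1, 4),
    ("Male", 180): Fraction(1, 4),
    ("Female", 160): Fraction(1, 4),
    ("Female", 170): Fraction(1, 8),
    ("Intersex", 165): Fraction(1, 8),
}


def marginal_s(joint):
    """Return P(S = s) by summing the joint distribution over heights."""
    p_s = {}
    for (s, _h), p in joint.items():
        p_s[s] = p_s.get(s, 0) + p
    return p_s


def conditional_expectation(joint, s, p_s):
    """Return E[H | S = s] = sum_h h * P(H = h, S = s) / P(S = s)."""
    return sum(h * p for (s2, h), p in joint.items() if s2 == s) / p_s[s]


p_s = marginal_s(joint)

# E[H] computed directly from the joint distribution.
e_h = sum(h * p for (_s, h), p in joint.items())

# E[E[H | S]]: each conditional expectation weighted by P(S = s).
e_h_given_s = {s: conditional_expectation(joint, s, p_s) for s in p_s}
tower = sum(p_s[s] * e_h_given_s[s] for s in p_s)

print(e_h, tower)  # 1355/8 1355/8
```

Note that simply adding the three conditional expectations without the $P(S = s)$ weights would not give $E[H]$; the weights are what make the average come out right.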
Variance and Covariance
The Variance of a Random Variable $X$ is how much we expect it to deviate from its Expected Value. The deviation $X - E[X]$ is a Random Variable itself[3]. We square it first because we want a nice positive number, and then take the Expected Value of that.
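$$\begin{aligned}
\mathrm{Var}(X) &= E[(X - E[X])^2] \\
&= E[X^2 - 2 \cdot X \cdot E[X] + (E[X])^2] \\
&= E[X^2] - E[2 \cdot X \cdot E[X]] + E[(E[X])^2] \\
&= E[X^2] - 2 \cdot E[X] \cdot E[E[X]] + (E[X])^2 \\
&= E[X^2] - 2 \cdot E[X] \cdot E[X] + (E[X])^2 \\
&= E[X^2] - 2 \cdot (E[X])^2 + (E[X])^2 \\
&= E[X^2] - (E[X])^2
\end{aligned}$$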
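As a sanity check on the derivation, this sketch computes the Variance of a fair die both ways: from the definition and from the shortcut at the end. The die is an assumed example and the variable names are mine.

```python
from fractions import Fraction

die = {face: Fraction(1, 6) for face in range(1, 7)}

mean = sum(x * p for x, p in die.items())  # E[X]

# Definition: E[(X - E[X])^2]
var_definition = sum((x - mean) ** 2 * p for x, p in die.items())

# Shortcut: E[X^2] - (E[X])^2
var_shortcut = sum(x ** 2 * p for x, p in die.items()) - mean ** 2

print(var_definition, var_shortcut)  # 35/12 35/12
```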
The Law of the Unconscious Statistician (LOTUS)
It's simple (and highly useful) enough in practice. If $g(X)$ is some function of Random Variable $X$,
$$E[g(X)] = \sum_x \left[ g(x) \cdot f_X(x) \right]$$
I do not completely understand the proof but here it is for reference.
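Not a proof, but a small numerical sketch of what LOTUS buys you. The example is assumed: $X$ uniform on $\{-2, -1, 0, 1, 2\}$ and $g(x) = x^2$, both picked only for illustration. One route applies LOTUS directly to the distribution of $X$; the other first builds the distribution of $Y = g(X)$ (where several values of $X$ collapse onto the same $Y$) and then takes its Expected Value.

```python
from collections import defaultdict
from fractions import Fraction


def g(x):
    """An example function of X, chosen for illustration."""
    return x * x


# Assumed example: X is uniform on {-2, -1, 0, 1, 2}.
pmf_x = {x: Fraction(1, 5) for x in range(-2, 3)}

# LOTUS: weight g(x) by the distribution of X; no distribution of g(X) needed.
lotus = sum(g(x) * p for x, p in pmf_x.items())

# The long way: build the distribution of Y = g(X), then compute E[Y].
pmf_y = defaultdict(Fraction)
for x, p in pmf_x.items():
    pmf_y[g(x)] += p
direct = sum(y * p for y, p in pmf_y.items())

print(lotus, direct)  # 2 2
```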